Tags: llm* + machine learning*


  1. This article explores various aspects of BERT, including the landscape at the time of its creation, a detailed breakdown of the model architecture, and the construction of a task-agnostic fine-tuning pipeline, demonstrated on sentiment analysis. Despite being one of the earliest LLMs, BERT remains relevant today and continues to find applications in both research and industry.
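    A minimal sketch of what such a fine-tuning pipeline typically looks like with the Hugging Face Transformers Trainer; the bert-base-uncased checkpoint, the IMDB dataset, and the hyperparameters are illustrative choices, not the article's exact code.

```python
# Illustrative BERT sentiment fine-tuning pipeline (checkpoint, dataset,
# and hyperparameters are assumptions, not the article's exact code).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # binary sentiment labels

dataset = load_dataset("imdb")  # any labeled sentiment corpus works

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-sentiment",
                           per_device_train_batch_size=16,
                           num_train_epochs=1),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```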
  2. This article explains how to use the Sentence Transformers library to finetune and train embedding models for a variety of applications, such as retrieval augmented generation, semantic search, and semantic textual similarity. It covers the training components, dataset format, loss function, training arguments, evaluators, and trainer.
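    A minimal sketch of that training flow using the Sentence Transformers v3 trainer API; the base checkpoint, dataset, and loss below are illustrative choices, not the article's exact code.

```python
# Illustrative embedding-model fine-tuning with Sentence Transformers v3;
# the checkpoint, dataset, and loss are assumptions for this sketch.
from datasets import load_dataset
from sentence_transformers import (SentenceTransformer,
                                   SentenceTransformerTrainer, losses)

model = SentenceTransformer("microsoft/mpnet-base")  # any base encoder works

# (anchor, positive) pairs; MultipleNegativesRankingLoss treats the other
# in-batch positives as negatives, a common setup for retrieval and search.
train_dataset = load_dataset("sentence-transformers/all-nli", "pair",
                             split="train")
loss = losses.MultipleNegativesRankingLoss(model)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset,
                                     loss=loss)
trainer.train()
model.save("finetuned-embedding-model")
```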
  3. This paper introduces Cross-Layer Attention (CLA), an extension of Multi-Query Attention (MQA) and Grouped-Query Attention (GQA) for reducing the size of the key-value cache in transformer-based autoregressive large language models (LLMs). The authors demonstrate that CLA can reduce the cache size by another 2x while maintaining nearly the same accuracy as unmodified MQA, enabling inference with longer sequence lengths and larger batch sizes.
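    The core idea, letting some layers reuse the key/value projections of an earlier layer instead of computing and caching their own, can be sketched in a few lines of PyTorch. This is an illustrative reading of the mechanism (single-head, no residuals or caching machinery), not the paper's reference implementation.

```python
# Sketch of cross-layer K/V sharing; illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CLABlock(nn.Module):
    """Attention block that either computes its own K/V or reuses the K/V
    of an earlier layer, so only the computing layers fill the KV cache."""
    def __init__(self, d_model: int, computes_kv: bool):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.computes_kv = computes_kv
        if computes_kv:
            self.k_proj = nn.Linear(d_model, d_model)
            self.v_proj = nn.Linear(d_model, d_model)

    def forward(self, x, shared_kv=None):
        q = self.q_proj(x)
        if self.computes_kv:
            k, v = self.k_proj(x), self.v_proj(x)
        else:
            k, v = shared_kv  # reuse K/V from the previous computing layer
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return out, (k, v)

# Sharing factor 2: only every other layer projects (and would cache) K/V,
# which is where the ~2x KV-cache reduction comes from.
layers = nn.ModuleList(CLABlock(512, computes_kv=(i % 2 == 0))
                       for i in range(4))
x, kv = torch.randn(1, 16, 512), None
for layer in layers:
    x, kv = layer(x, shared_kv=kv)
```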
  4. Google has launched Model Explorer, an open-source tool designed to help users navigate and understand complex neural networks. The tool aims to provide a hierarchical approach to AI model visualization, enabling smooth navigation even for massive models. Model Explorer has already proved valuable in the deployment of large models to resource-constrained platforms and is part of Google's broader ‘AI on the Edge’ initiative.
  5. Stay informed about the latest artificial intelligence (AI) terminology with this comprehensive glossary. From algorithm and AI ethics to generative AI and overfitting, learn the essential AI terms that will help you sound smart over drinks or impress in a job interview.
  6. Researchers from NYU Tandon School of Engineering investigated whether modern natural language processing systems could solve the daily Connections puzzles from The New York Times. The results showed that while all the AI systems could solve some of the puzzles, they struggled overall.
  7. This article discusses the process of training a large language model (LLM) using reinforcement learning from human feedback (RLHF) and a newer alternative, Direct Preference Optimization (DPO). It explains how these methods align the LLM with human preferences, and how DPO achieves this with a simpler, more efficient training setup than RLHF.
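    The DPO objective itself is compact enough to write out. The sketch below assumes per-response log-probabilities have already been summed over tokens under the trained policy and a frozen reference model; the beta value is illustrative.

```python
# Sketch of the DPO loss; inputs are per-response log-probabilities,
# already summed over tokens (an assumption of this sketch).
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Implicit rewards: how far the policy has moved away from the
    # frozen reference model on each response.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between preferred and rejected responses.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```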
  8. - Standardization, governance, simplified troubleshooting, and reusability in ML application development.
     - Integrations with vector databases and LLM providers to support new applications.
     - Provides tutorials on integrating
  9. This article provides a beginner-friendly introduction to Large Language Models (LLMs) and explains the key concepts in a clear and organized way.
  10. • A beginner's guide to understanding Hugging Face Transformers, a library that provides access to thousands of pre-trained transformer models for natural language processing, computer vision, and more.
    • The guide covers the basics of Hugging Face Transformers, including what it is, how it works, and how to use it, with a simple example of running Microsoft's Phi-2 LLM in a notebook (sketched after this entry).
    • The guide is designed for non-technical individuals who want to understand open-source machine learning without prior knowledge of Python or machine learning.
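    Roughly what that notebook example looks like with the transformers pipeline API; the prompt and generation settings are illustrative, not the guide's exact code.

```python
# Illustrative notebook snippet for running microsoft/phi-2 via the
# high-level pipeline API (prompt and settings are assumptions).
from transformers import pipeline

generator = pipeline("text-generation", model="microsoft/phi-2")
out = generator("Explain what a transformer model is in one sentence:",
                max_new_tokens=60)
print(out[0]["generated_text"])
```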
